We construct integral and supremum type goodness-of-fit tests for the family of power
distribution functions. The test statistics are functionals of U-empirical processes and are
based on the classical characterization of the power function distribution family due to
Puri and Rubin. We describe the logarithmic large deviation asymptotics of the test statistics
under the null hypothesis and calculate their local Bahadur efficiency under common
parametric alternatives. Conditions of local optimality of the new statistics are given.
1 Introduction

Testing goodness-of-fit for parametric families of distributions remains one of the important and
interesting statistical problems. Let $\mathcal{P}$ be the family of power function distributions with the
distribution functions (d.f.)

$$F(x) = x^{\lambda}, \quad x \in (0,1), \quad \lambda > 0. \tag{1.1}$$

It is a member of the beta family and is the "inverse" of the Pareto distribution. The power function
distribution often appears in applications, e.g. in the study of service periods of queueing
systems [I], in economic models of lead-time and pricing [19], and in the reliability of
electric systems [H].

We are interested in goodness-of-fit tests for this family which do not depend on the unknown
parameter $\lambda$. As far as we know, the only attempt to build such tests was made by
Martynov [9], [10], who proposed using Durbin's well-known approach [3] based on the
empirical process with estimated parameters.

In this paper we develop a completely different approach, introducing and analyzing two tests based
on a characterization of the power function distribution. Consider the following characterization
by Puri and Rubin [17]:

Let $X$ and $Y$ be i.i.d. non-negative random variables. Then the equality in law of $X$ and
$\min(X/Y, Y/X)$ takes place iff $X$ has some d.f. from the family $\mathcal{P}$.

It should be noted that this result was obtained by means of a monotonic transformation
from the characterization of exponentiality obtained in [17]. This is a common and traditional
method of restating characterization theorems. However, as noted in [5, p. 169], "while a
property can be interesting for one distribution, it may lose its appeal after a transformation."
Nevertheless, we find the characterization of the power function distribution family stated above rather
convenient for goodness-of-fit purposes.

Let $X_1, X_2, \ldots$ be i.i.d. observations with the continuous d.f. $F$. We are interested in testing
the hypothesis $H_0: F \in \mathcal{P}$ against the general alternative $H_1: F \notin \mathcal{P}$, assuming, however, that
the alternative d.f. is also concentrated on $(0,1)$.

Let $F_n(t) = n^{-1} \sum_{i=1}^{n} 1\{X_i < t\}$, $t \in R^1$, be the usual empirical d.f. based on the sample
$X_1, \ldots, X_n$. According to the Puri-Rubin characterization we introduce the so-called U-empirical
d.f., see 0, ®, by

$$H_n(t) = \binom{n}{2}^{-1} \sum_{1 \le i < j \le n} 1\{\min(X_i/X_j, X_j/X_i) < t\}, \quad t \in (0,1).$$

Consider two statistics which can be used for testing $H_0$ against $H_1$:

$$I_n^{PR} = \int_0^1 \left(H_n(t) - F_n(t)\right) dF_n(t), \tag{1.2}$$

$$D_n^{PR} = \sup_{t \in [0,1]} \left| H_n(t) - F_n(t) \right|. \tag{1.3}$$

The first of these statistics is of integral type and resembles the classical $\omega$-type statistic, while the
second is of Kolmogorov type. We will describe their limit distributions under $H_0$ and
calculate their local Bahadur efficiency under certain parametric alternatives. To this end we
need their rough (logarithmic) large deviation asymptotics under $H_0$. Moreover, we will discuss the conditions
of their local optimality in the Bahadur sense.
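As an illustration of definitions (1.2) and (1.3), the following Python sketch evaluates both statistics by brute force. It is not taken from the paper: the function name `pr_statistics` is hypothetical, the sample is assumed to consist of distinct points of $(0,1)$, and the enumeration of all pairs is meant for small $n$ only.

```python
from bisect import bisect_left, bisect_right
from itertools import combinations


def pr_statistics(sample):
    """Return (I_n, D_n) for a sample of distinct points in (0, 1).

    Hypothetical helper: a brute-force evaluation of (1.2) and (1.3).
    """
    xs = sorted(sample)
    n = len(xs)
    # The values min(X_i/X_j, X_j/X_i) over all pairs i < j define H_n.
    ratios = sorted(min(a / b, b / a) for a, b in combinations(xs, 2))
    m = len(ratios)

    # Integral statistic: dF_n puts mass 1/n at each observation, and both
    # empirical d.f.'s use the strict inequality "< t" as in the text.
    i_n = sum(
        bisect_left(ratios, x) / m - bisect_left(xs, x) / n for x in xs
    ) / n

    # Supremum statistic: H_n - F_n is a step function, so it suffices to
    # inspect its value immediately to the right of every jump point.
    d_n = max(
        abs(bisect_right(ratios, p) / m - bisect_right(xs, p) / n)
        for p in xs + ratios
    )
    return i_n, d_n
```

Since both statistics depend on the data only through the ordering of the observations and of the pairwise ratios, replacing every $X_i$ by $X_i^c$ with $c > 0$ should leave the output of this sketch unchanged.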

For basic information on Bahadur theory we refer to PQ, [2] and [12]. This type of efficiency
is most pertinent in our problem, as the Kolmogorov type statistics have a non-normal limiting distribution
and hence the Pitman approach is not applicable.

In Bahadur theory the measure of efficiency of a sequence of statistics $\{T_n\}$ is the exact
slope $c_T(\theta)$ describing the exponential rate of decrease of their P-values under the alternative. It
is well known (this is the so-called Bahadur-Raghavachari inequality [I], [12]) that always

$$c_T(\theta) \le 2K(\theta),$$

where $K(\theta)$ is the Kullback-Leibler "distance" between the null hypothesis and the alternative,
which is indexed by a real parameter $\theta$. Therefore we may define the local Bahadur efficiency as

$$\mathrm{eff}(T) := \lim_{\theta \to 0} c_T(\theta) / 2K(\theta).$$

2 Statistic $I_n^{PR}$

The statistic $I_n^{PR}$ is asymptotically equivalent to the U-statistic of degree 3 with the centred
kernel

$$\Psi_{PR}(X,Y,Z) = \frac{1}{3}\left(1\{\min(X/Y, Y/X) < Z\} + 1\{\min(X/Z, Z/X) < Y\} + 1\{\min(Y/Z, Z/Y) < X\}\right) - \frac{1}{2}.$$
Note that both statistics $I_n^{PR}$ and $D_n^{PR}$ under $H_0$ are invariant with respect to the change
of variable $X \to X^{1/\lambda}$. Therefore we may take $\lambda = 1$, i.e. we can assume that the initial sample
is uniform on $(0,1)$.

It is well known, see, e.g., [6], [8], that non-degenerate U- and V-statistics are asymptotically
normal. To prove that the kernel $\Psi_{PR}(X,Y,Z)$ is non-degenerate, let us calculate its projection
$\psi_{PR}$. For fixed $X = s$ we have

$$\psi_{PR}(s) := E(\Psi_{PR}(X,Y,Z) \mid X = s) = \frac{2}{3} P\{\min(s/Y, Y/s) < Z\} + \frac{1}{3} P\{\min(Y/Z, Z/Y) < s\} - \frac{1}{2}.$$
The first probability can be evaluated as follows:

$$P\{\min(s/Y, Y/s) < Z\} = 1 - \int_0^s \frac{y}{s}\,dy - \int_s^1 \frac{s}{y}\,dy = 1 + s\ln s - \frac{1}{2}s,$$

and it follows from the above characterization that

$$P\{\min(Y/Z, Z/Y) < s\} = P\{Y < s\} = s, \quad 0 < s < 1.$$

Hence we get the final expression for the projection of the kernel:

$$\psi_{PR}(s) = \frac{1}{6} + \frac{2}{3}\, s \ln s, \quad 0 < s < 1. \tag{2.1}$$
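The variance of the projection is given by

$$\Delta_{PR}^2 = \int_0^1 \psi_{PR}^2(s)\,ds = \frac{5}{972},$$

and is positive. Hence, the kernel $\Psi_{PR}(X,Y,Z)$ is non-degenerate. Due to Hoeffding's theorem, $\sqrt{n}\, I_n^{PR}$ is asymptotically normal under $H_0$ with zero mean and variance $9\Delta_{PR}^2 = 5/108$.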
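The closed forms above can be checked numerically. The sketch below is only an illustration under our own assumptions (a plain composite midpoint rule and the hypothetical helper names `midpoint`, `psi`, `first_probability`); it recomputes the first probability in the projection and the variance $5/972$ under the uniform null distribution.

```python
import math


def midpoint(f, a, b, n=100_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    step = (b - a) / n
    return step * sum(f(a + (k + 0.5) * step) for k in range(n))


def psi(s):
    """Projection (2.1) of the kernel under the uniform null distribution."""
    return 1.0 / 6.0 + (2.0 / 3.0) * s * math.log(s)


def first_probability(s, n=100_000):
    """P{min(s/Y, Y/s) < Z} = 1 - E min(s/Y, Y/s) for Y, Z uniform on (0, 1)."""
    return 1.0 - midpoint(lambda y: min(s / y, y / s), 0.0, 1.0, n)


# Variance of the projection (expected to be close to 5/972) and its mean
# (expected to be close to 0, because the kernel is centred).
variance = midpoint(lambda s: psi(s) ** 2, 0.0, 1.0)
mean = midpoint(psi, 0.0, 1.0)
```

Multiplying `variance` by $3^2 = 9$, the squared degree of the U-statistic, gives the asymptotic variance of $\sqrt{n}\, I_n^{PR}$ under the null hypothesis.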

The kernel $\Psi_{PR}$ is centred, non-degenerate and bounded. Applying the theorem on large
deviations for non-degenerate U-statistics from [15], see also [2], [13], we get:

Theorem 2.1. For $a > 0$ it holds true that

$$\lim_{n \to \infty} n^{-1} \ln P(I_n^{PR} > a) = -f(a),$$

where the function $f$ is analytic for sufficiently small $a > 0$, and

$$f(a) \sim \frac{a^2}{18\Delta_{PR}^2} = \frac{54}{5}\, a^2 \quad \text{as } a \to 0.$$

In the case of the uniform null distribution, and more generally of the power function distribution,
there are no accepted standard alternatives. Therefore we consider in this paper three alternatives:
the contamination alternative and two other unnamed alternatives concentrated on $(0,1)$.
The expressions of these alternative d.f.'s are as follows:

$$G_1(x,\theta) = (1-\theta)x + \theta x^r, \quad 0 \le \theta \le 1, \ r > 1, \ x \in (0,1);$$

$$G_2(x,\theta) = x - \theta \sin(\pi x), \quad 0 \le \theta \le 1/\pi, \ x \in (0,1);$$

$$G_3(x,\theta) = x + \theta \int_0^x \left(\frac{1}{6} + \frac{2}{3}\, y \ln y\right) dy, \quad 0 \le \theta \le 1, \ x \in (0,1).$$

The formulas for the corresponding densities $g_j(x,\theta)$, $j = 1,2,3$, are straightforward.

We will need in the sequel the asymptotics as $\theta \to 0$ of the Kullback-Leibler "distance"
between the null hypothesis and the considered alternatives. Note that the null hypothesis is
composite. We now establish a general form for this distance as $\theta \to 0$.

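Lemma 2.2. Let $g(x,\theta)$ be any alternative density on $(0,1)$ which is sufficiently regular so that
any differentiation under the integral sign in the proof is justifiable and the Kullback-Leibler
information (2.2) is well defined. Put

$$K(\theta) = \inf_{\lambda > 0} \int_0^1 \ln\left[\frac{g(x,\theta)}{\lambda x^{\lambda-1}}\right] g(x,\theta)\,dx. \tag{2.2}$$

Then

$$2K(\theta) \sim \left[\int_0^1 \left(g'_\theta(x,0)\right)^2 dx - \left(\int_0^1 g'_\theta(x,0) \ln x\,dx\right)^2\right] \theta^2, \quad \theta \to 0. \tag{2.3}$$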



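Proof. The infimum in (2.2) is attained for $\lambda = -\left(\int_0^1 g(x,\theta)\ln x\,dx\right)^{-1}$ and equals

$$K(\theta) = \int_0^1 g(x,\theta)\ln g(x,\theta)\,dx + \int_0^1 g(x,\theta)\ln x\,dx + \ln\left(-\int_0^1 g(x,\theta)\ln x\,dx\right) + 1. \tag{2.4}$$

As $\theta \to 0$ the function $K(\theta)$ has the following form:

$$K(\theta) \sim K(0) + K'(0)\,\theta + \frac{1}{2} K''(0)\,\theta^2.$$

It is easy to see that $K(0) = 0$ and that $K'(0) = 0$.

Differentiating the right-hand side of (2.4) twice in $\theta$, we get

$$K''(\theta) = \int_0^1 g''_{\theta\theta}(x,\theta)\left(1 + \ln g(x,\theta) + \ln x\right) dx + \int_0^1 \frac{\left(g'_\theta(x,\theta)\right)^2}{g(x,\theta)}\,dx + \frac{\int_0^1 g''_{\theta\theta}(x,\theta)\ln x\,dx}{\int_0^1 g(x,\theta)\ln x\,dx} - \left(\frac{\int_0^1 g'_\theta(x,\theta)\ln x\,dx}{\int_0^1 g(x,\theta)\ln x\,dx}\right)^2.$$

Substituting $\theta = 0$ and using $g(x,0) = 1$, $\int_0^1 \ln x\,dx = -1$ and $\int_0^1 g''_{\theta\theta}(x,0)\,dx = 0$, one obtains the required expression. □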

Let us calculate the local Bahadur exact slope and the local efficiency of the sequence of statistics
$I_n^{PR}$ for the alternative d.f. $G(x,\theta)$ and the density $g(x,\theta)$, assuming their regularity and
the possibility of differentiating under the integral sign. These conditions are valid for all three
alternatives we consider. Denote also $h(x) = g'_\theta(x,0)$. Note that $\int_0^1 h(x)\,dx = 0$.

According to the Law of Large Numbers for U-statistics [8], the limit in probability of the
sequence $I_n^{PR}$ under any such alternative is equal as $\theta \to 0$ to

$$b_I(\theta) = P_\theta(\min(X/Y, Y/X) < Z) - \frac{1}{2} = 2\int_0^1 g(z,\theta)\,dz \int_0^1 g(y,\theta)\, G(yz,\theta)\,dy - \frac{1}{2} \sim J(0) + J'(0)\,\theta.$$
It is easy to see that $J(0) = 0$, while $J'(0) = 2\int_0^1 h(z) z\,dz + 2\int_0^1 dz \int_0^1 dy \int_0^{yz} h(x)\,dx$. Changing
the order of integration twice, we get

$$\int_0^1 dz \int_0^1 dy \int_0^{yz} h(x)\,dx = \int_0^1 dz \int_0^z h(x)\left(1 - \frac{x}{z}\right) dx = \int_0^1 h(x)\,(x \ln x - x)\,dx.$$

It follows therefore that

$$b_I(\theta; PR) \sim 3\theta \int_0^1 \psi_{PR}(x)\, h(x)\,dx. \tag{2.5}$$

Contamination alternative. After elementary calculations we get by (2.5) as $\theta \to 0$ that
$b_I(\theta) \sim \frac{(r-1)^2}{2(r+1)^2}\,\theta$. Therefore the local exact slope of the sequence of statistics $I_n^{PR}$
as $\theta \to 0$ admits the representation

$$c_I(\theta; PR) \sim \frac{108}{5}\, b_I^2(\theta; PR) = \frac{27(r-1)^4}{5(r+1)^4}\,\theta^2.$$

It is easy to show using (2.3) that for the alternative d.f. $G_1$

$$2K(\theta) \sim \frac{(r-1)^4}{r^2(2r-1)}\,\theta^2, \quad \theta \to 0. \tag{2.6}$$

Hence the local Bahadur efficiency of our test is equal to

$$\mathrm{eff}(r; I^{PR}) = \lim_{\theta \to 0} \frac{c_I(\theta; PR)}{2K(\theta)} = \frac{27(2r-1)\,r^2}{5(r+1)^4}.$$

This efficiency is reasonably high for moderate values of $r$; its maximum is attained for
$r = 2 + \sqrt{3}$ and equals 0.970.
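The location and value of this maximum can be reproduced with a few lines of Python. This is a sketch only: the function name `eff_contamination_integral` and the crude grid search over $1 < r \le 20$ are our own choices.

```python
import math


def eff_contamination_integral(r):
    """Local Bahadur efficiency of the integral test against G_1, r > 1."""
    return 27.0 * (2.0 * r - 1.0) * r ** 2 / (5.0 * (r + 1.0) ** 4)


# Crude grid search over r in (1, 20] with step 0.001.
grid = [1.0 + k / 1000.0 for k in range(1, 19001)]
r_best = max(grid, key=eff_contamination_integral)
```

Setting the logarithmic derivative $2/r + 2/(2r-1) - 4/(r+1)$ to zero leads to $r^2 - 4r + 1 = 0$, whose root larger than 1 is $2 + \sqrt{3}$, in agreement with the grid search.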

Second alternative. The calculation of the local Bahadur efficiency in the case of the alternative
$G_2$ is quite similar. We have by (2.5) $b_I(\theta; PR) \sim 0.224\,\theta$, so that the local exact slope of $I_n^{PR}$
as $\theta \to 0$ admits the representation $c_I(\theta; PR) \sim 1.083\,\theta^2$.

According to (2.3), the Kullback-Leibler information in this case satisfies

$$2K(\theta) \sim 1.505\,\theta^2, \quad \theta \to 0. \tag{2.7}$$

Consequently the local Bahadur efficiency of our test is $\mathrm{eff}(I^{PR}) = 0.719$.
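The numerical constants for the second alternative can be recomputed from (2.5) and (2.3) by quadrature. The following sketch assumes a simple midpoint rule and uses hypothetical helper names; `h_second` is the derivative in $\theta$ of the density $g_2(x,\theta) = 1 - \theta\pi\cos(\pi x)$ at $\theta = 0$.

```python
import math


def midpoint(f, n=100_000):
    """Composite midpoint rule on (0, 1); never evaluates f at 0 or 1."""
    step = 1.0 / n
    return step * sum(f((k + 0.5) * step) for k in range(n))


def psi(x):
    """Projection (2.1)."""
    return 1.0 / 6.0 + (2.0 / 3.0) * x * math.log(x)


def local_quantities_integral(h):
    """Return (b, c, k2, eff) for a function h(x) = g'_theta(x, 0):
    the coefficients of theta in b_I, of theta^2 in c_I and in 2K,
    and the resulting local efficiency of the integral test."""
    b = 3.0 * midpoint(lambda x: psi(x) * h(x))        # formula (2.5)
    c = (108.0 / 5.0) * b ** 2                         # b^2 / (9 * 5/972)
    second_moment = midpoint(lambda x: h(x) ** 2)
    log_moment = midpoint(lambda x: h(x) * math.log(x))
    k2 = second_moment - log_moment ** 2               # formula (2.3)
    return b, c, k2, c / k2


def h_second(x):
    """h for G_2: derivative in theta of 1 - theta * pi * cos(pi x)."""
    return -math.pi * math.cos(math.pi * x)
```

Calling `local_quantities_integral(h_second)` should return values close to the constants 0.224, 1.083, 1.505 and 0.719 quoted above; the same function can be applied to any other sufficiently regular `h`.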

Third alternative. In the case of the third alternative the calculations are similar, and we
obtain that $c_I(\theta; PR) \sim \frac{5\theta^2}{972}$ as $\theta \to 0$. The Kullback-Leibler information
also satisfies in this case the relation

$$2K(\theta) \sim \frac{5\theta^2}{972}, \quad \theta \to 0. \tag{2.8}$$

Therefore, the local Bahadur efficiency is equal to 1, and the integral test is locally optimal
in the Bahadur sense [12, Ch. 6]. We will return to the cause of this phenomenon in the last section.



Table 2.1. Local Bahadur efficiency for the statistic $I_n^{PR}$.

Alternative    Efficiency
$G_1$          0.970 (for $r \approx 3.7$)
$G_2$          0.719
$G_3$          1.000


3 Statistic $D_n^{PR}$

Now we consider the Kolmogorov type statistic (1.3). In this case, for fixed $t$ the difference
$H_n(t) - F_n(t)$ is a family of U-statistics with the kernels

$$\Xi_{PR}(X,Y;t) = 1\{\min(X/Y, Y/X) < t\} - \frac{1}{2}\, 1\{X < t\} - \frac{1}{2}\, 1\{Y < t\},$$

depending on $t \in (0,1)$. The projection of this kernel for fixed $t \in (0,1)$ has the form

$$\xi_{PR}(s;t) := E(\Xi_{PR}(X,Y;t) \mid X = s) = P\{\min(s/Y, Y/s) < t\} - \frac{1}{2}\, 1\{s < t\} - \frac{1}{2}\, P\{Y < t\}.$$

After easy calculations we get

$$\xi_{PR}(s;t) = 1\{s < t\}\left(\frac{1}{2} - \frac{s}{t}\right) + st - \frac{1}{2}\,t. \tag{3.1}$$

Now let us calculate the variance $\delta_{PR}^2(t)$ of this projection. After some simple calculations we have

$$\delta_{PR}^2(t) := E\,\xi_{PR}^2(X;t) = \frac{1}{12}\,t\,(1 + t - 2t^2), \quad 0 < t < 1. \tag{3.2}$$

It is easy to see that the supremum of the function $\delta_{PR}^2(t)$ is attained at the point $t^* = \frac{1+\sqrt{7}}{6}$
and equals $\delta_{PR}^2 \approx 0.044$. Hence our family of kernels $\Xi_{PR}(X,Y;t)$ is non-degenerate in the sense of [13].
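Formulas (3.1) and (3.2) and the value of the supremum can be checked numerically. The sketch below is illustrative only: all function names are ours, the quadrature is a plain midpoint rule, and the last constant is the quantity $1/(8\delta_{PR}^2)$ that enters the large deviation asymptotics of the supremum statistic.

```python
import math


def xi(s, t):
    """Projection (3.1) of the kernel family."""
    return (0.5 - s / t if s < t else 0.0) + s * t - 0.5 * t


def delta2(t):
    """Closed form (3.2) for the variance of the projection."""
    return t * (1.0 + t - 2.0 * t * t) / 12.0


def delta2_numeric(t, n=20_000):
    """Midpoint-rule value of E xi^2(X; t) for X uniform on (0, 1),
    integrating separately on each side of the jump of xi at s = t."""
    def integral(a, b):
        step = (b - a) / n
        return step * sum(xi(a + (k + 0.5) * step, t) ** 2 for k in range(n))
    return integral(0.0, t) + integral(t, 1.0)


t_star = (1.0 + math.sqrt(7.0)) / 6.0     # root of 1 + 2t - 6t^2 = 0 in (0, 1)
delta2_max = delta2(t_star)               # about 0.044
ld_constant = 1.0 / (8.0 * delta2_max)    # about 2.84
```

The point `t_star` is obtained by differentiating $t + t^2 - 2t^3$, which gives the quadratic equation $1 + 2t - 6t^2 = 0$.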

The limiting distribution of the statistic $D_n^{PR}$ is unknown. Using known methods for U-empirical processes,
one can show that the U-empirical process

$$\eta_n(t) = \sqrt{n}\left(H_n(t) - F_n(t)\right), \quad t \in (0,1),$$

converges weakly as $n \to \infty$ to some centred Gaussian process $\eta(t)$ with a complicated covariance.
Then the sequence of statistics $\sqrt{n}\,D_n^{PR}$ converges in distribution to the random variable
$\sup_t |\eta(t)|$, whose distribution we are not able to find. Hence we suggest using statistical modelling
(simulation) to evaluate the critical values for the statistic $D_n^{PR}$.
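A minimal simulation sketch of this suggestion is given below. It is not the authors' procedure: the function names, the number of replications, the seed and the brute-force evaluation of the statistic are all assumptions made for illustration. Because the null distribution of the statistic does not depend on $\lambda$, it is enough to simulate uniform samples.

```python
import random
from bisect import bisect_right
from itertools import combinations


def d_statistic(sample):
    """Kolmogorov type statistic (1.3) by brute force (hypothetical helper)."""
    xs = sorted(sample)
    n = len(xs)
    ratios = sorted(min(a / b, b / a) for a, b in combinations(xs, 2))
    m = len(ratios)
    # H_n - F_n is a step function: inspect its value just after each jump.
    return max(
        abs(bisect_right(ratios, p) / m - bisect_right(xs, p) / n)
        for p in xs + ratios
    )


def simulated_critical_value(n, level=0.05, replications=2000, seed=0):
    """Empirical (1 - level)-quantile of D_n under the uniform null."""
    rng = random.Random(seed)
    values = sorted(
        # 1 - random() lies in (0, 1], which avoids division by zero.
        d_statistic([1.0 - rng.random() for _ in range(n)])
        for _ in range(replications)
    )
    index = min(replications - 1, int((1.0 - level) * replications))
    return values[index]
```

For example, `simulated_critical_value(50, level=0.05)` would give an approximate 5% critical value for samples of size 50; more replications give a more stable estimate at a higher computational cost.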

The family of kernels $\{\Xi_{PR}(X,Y;t)\}$, $t \in (0,1)$, is not only centred but also bounded. Using
the results of [13] on large deviations of families of non-degenerate U-statistics, we obtain the
following result.

Theorem 3.1. For sufficiently small $a > 0$ it holds true that

$$\lim_{n \to \infty} n^{-1} \ln P(D_n^{PR} > a) = -k(a),$$

where the function $k$ is analytic, and moreover

$$k(a) = \frac{a^2}{8\delta_{PR}^2}\,(1 + o(1)) \sim 2.84\,a^2 \quad \text{as } a \to 0.$$



Contamination alternative. Let us calculate the local Bahadur slope and the local efficiency of
the statistic (1.3) for the alternative d.f. $G_1(x,\theta)$. By the Glivenko-Cantelli theorem for U-empirical
d.f.'s [7], the almost sure limit of $D_n^{PR}$ under any alternative is equal to

$$b_D(\theta; PR) = \sup_{0 \le t \le 1} \left| 2\int_0^1 g(y,\theta)\, G(ty,\theta)\,dy - G(t,\theta) \right|. \tag{3.3}$$

Assuming the regularity of the alternative d.f., we can deduce

$$b_D(\theta; PR) \sim 2 \sup_{0 \le t \le 1} \left| \int_0^1 \xi_{PR}(s;t)\, h(s)\,ds \right| \cdot \theta, \tag{3.4}$$

where $\xi_{PR}(s;t)$ is from (3.1). Applying this formula we get for our alternative

$$b_D(\theta; PR) \sim \frac{(r-1)^2\, r^{\frac{r}{1-r}}}{r+1}\,\theta, \quad \theta \to 0.$$

Hence the local exact slope of the sequence of statistics $D_n^{PR}$ as $\theta \to 0$ admits the representation

$$c_D(\theta; PR) \sim \frac{5.68\,(r-1)^4\, r^{\frac{2r}{1-r}}}{(r+1)^2}\,\theta^2.$$



The Kullback-Leibler information satisfies (2.6). Hence the local Bahadur efficiency of our test
is equal to

$$\mathrm{eff}(r; D^{PR}) = \frac{5.68\,(2r-1)\, r^{\frac{2}{1-r}}}{(r+1)^2}.$$

It can be shown that the maximal value of the local efficiency for the sequence $\{D_n^{PR}\}$ is
attained for $r \approx 4.64$ and is equal to 0.636, while its values are larger than 0.5 for $3 < r < 11.5$ and still about 0.49 at $r = 12$.
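The maximization of this efficiency can be reproduced numerically. In the sketch below (our own illustration, with hypothetical names) the constant 5.68 is recomputed as $1/(4\sup_t \delta_{PR}^2(t))$ from (3.2), and the maximum is located by a crude grid search over $1 < r \le 20$.

```python
import math

T_STAR = (1.0 + math.sqrt(7.0)) / 6.0
DELTA2_MAX = T_STAR * (1.0 + T_STAR - 2.0 * T_STAR ** 2) / 12.0   # sup of (3.2)
SLOPE_CONSTANT = 1.0 / (4.0 * DELTA2_MAX)                          # about 5.68


def eff_contamination_sup(r):
    """Local Bahadur efficiency of the Kolmogorov type test against G_1, r > 1."""
    power = r ** (2.0 / (1.0 - r))
    return SLOPE_CONSTANT * (2.0 * r - 1.0) * power / (r + 1.0) ** 2


# Crude grid search over r in (1, 20] with step 0.001.
grid = [1.0 + k / 1000.0 for k in range(1, 19001)]
r_best = max(grid, key=eff_contamination_sup)
```

Evaluating `eff_contamination_sup` on a range of values of `r` also shows how slowly the efficiency decays to the right of its maximum.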

Second alternative. The calculation of the local Bahadur efficiency in the case of the alternative
$G_2$ is quite similar. We have as $\theta \to 0$ by (3.4) $b_D(\theta; PR) \sim 0.367\,\theta$. Therefore the local exact
slope of $D_n^{PR}$ admits the asymptotics $c_D(\theta; PR) \sim 0.765\,\theta^2$. The Kullback-Leibler information
in this case is given by (2.7). Hence the local Bahadur efficiency of our test is $\mathrm{eff}(D^{PR}) = 0.508$.

Third alternative. In this case we get for the alternative d.f. $G_3(x,\theta)$ as $\theta \to 0$ that
$b_D(\theta; PR) \sim 0.0249\,\theta$. Hence the local exact slope of the sequence of statistics $D_n^{PR}$ as $\theta \to 0$
admits the representation $c_D(\theta; PR) \sim 0.00352\,\theta^2$. We know that the Kullback-Leibler information
in this case satisfies (2.8). Thus the local Bahadur efficiency of our test is equal to 0.685.
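The constants for the second and third alternatives follow from (3.4) by a one-dimensional maximization. The sketch below is an illustration under our own assumptions (midpoint quadrature, a grid of 99 values of $t$, hypothetical function names); small differences in the third decimal place compared with the rounded values in the text are to be expected.

```python
import math


def xi(s, t):
    """Projection (3.1)."""
    return (0.5 - s / t if s < t else 0.0) + s * t - 0.5 * t


def integrate_against_xi(h, t, n=1000):
    """Midpoint rule for the integral of xi(s; t) h(s) over (0, 1),
    split at the jump point s = t."""
    total = 0.0
    for a, b in ((0.0, t), (t, 1.0)):
        step = (b - a) / n
        for k in range(n):
            s = a + (k + 0.5) * step
            total += step * xi(s, t) * h(s)
    return total


def b_coefficient_sup(h, grid_size=100):
    """Coefficient of theta in (3.4): 2 * sup_t |int xi(s; t) h(s) ds|."""
    ts = [k / grid_size for k in range(1, grid_size)]
    return 2.0 * max(abs(integrate_against_xi(h, t)) for t in ts)


def h_second(s):
    """h for G_2."""
    return -math.pi * math.cos(math.pi * s)


def h_third(s):
    """h for G_3."""
    return 1.0 / 6.0 + (2.0 / 3.0) * s * math.log(s)


T_STAR = (1.0 + math.sqrt(7.0)) / 6.0
# 1 / (4 sup_t delta^2(t)) with delta^2 from (3.2); about 5.68.
SLOPE_CONSTANT = 3.0 / (T_STAR * (1.0 + T_STAR - 2.0 * T_STAR ** 2))
```

With these helpers, the local efficiencies are `SLOPE_CONSTANT * b_coefficient_sup(h_second) ** 2 / 1.505` and `SLOPE_CONSTANT * b_coefficient_sup(h_third) ** 2 / (5 / 972)`, using the Kullback-Leibler constants from (2.7) and (2.8).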

It is seen that the Kolmogorov type statistic is less efficient than the integral statistic $I_n^{PR}$, as
is usual in goodness-of-fit testing [12].
Table 3.2. Local Bahadur efficiency for the statistic $D_n^{PR}$.

Alternative    Efficiency
$G_1$          0.636 (for $r \approx 4.64$)
$G_2$          0.508
$G_3$          0.685



4 Conditions of local asymptotic optimality 

In this section we are interested in conditions of local asymptotic optimality (LAO) in the Bahadur
sense for both sequences of statistics $I_n^{PR}$ and $D_n^{PR}$. This means describing the local structure
of the alternatives for which the given statistic has maximal potential local efficiency, so that
the relation

$$c_T(\theta) \sim 2K(\theta), \quad \theta \to 0,$$

holds, see [12], [16]. Such alternatives form the domain of LAO for the given sequence of statistics.
Consider the functions

$$H(x) = G'_\theta(x,\theta)\big|_{\theta=0}, \quad h(x) = g'_\theta(x,\theta)\big|_{\theta=0}.$$



We will assume that the following regularity conditions hold:

$$\int_0^1 h^2(x)\,dx < \infty, \quad \text{where } h(x) = H'(x), \tag{4.1}$$

$$\frac{\partial}{\partial\theta} \int_0^1 g(x,\theta)\, q(x)\,dx\,\Big|_{\theta=0} = \int_0^1 h(x)\, q(x)\,dx \quad \forall q \in L_1(0,1). \tag{4.2}$$

Denote by $\mathcal{G}$ the class of densities $g(x,\theta)$ with d.f.'s $G(x,\theta)$ satisfying the regularity conditions
(4.1) - (4.2). We are going to deduce the LAO conditions in terms of the function $h(x)$.

For alternative densities from $\mathcal{G}$ the arguments of Lemma 2.2 are true, hence the asymptotics

$$2K(\theta) \sim \left[\int_0^1 h^2(x)\,dx - \left(\int_0^1 h(x)\ln x\,dx\right)^2\right] \theta^2, \quad \theta \to 0,$$

is valid.

First consider the integral statistic $I_n^{PR}$ with the kernel $\Psi_{PR}(x,y,z)$ and its projection
$\psi_{PR}(x) = \frac{1}{6} + \frac{2}{3}\, x \ln x$. Let us introduce the auxiliary function

$$h_0(x) = h(x) - (\ln x + 1)\int_0^1 \ln u\; h(u)\,du.$$

Simple calculations show that

$$\int_0^1 h^2(x)\,dx - \left(\int_0^1 h(x)\ln x\,dx\right)^2 = \int_0^1 h_0^2(x)\,dx,$$

$$\int_0^1 \psi_{PR}(x)\, h(x)\,dx = \int_0^1 \psi_{PR}(x)\, h_0(x)\,dx,$$

$$\int_0^1 \xi_{PR}(x;t)\, h(x)\,dx = \int_0^1 \xi_{PR}(x;t)\, h_0(x)\,dx \quad \text{for any } t \in (0,1).$$
Jo Jo 
Hence the local asymptotic efficiency by (2.5) takes the form

$$\mathrm{eff}(I^{PR}) = \lim_{\theta \to 0} \frac{b_I^2(\theta; PR)}{9\Delta_{PR}^2 \cdot 2K(\theta)} = \frac{\left(\int_0^1 \psi_{PR}(x)\, h_0(x)\,dx\right)^2}{\int_0^1 \psi_{PR}^2(x)\,dx \cdot \int_0^1 h_0^2(x)\,dx}.$$

By the Cauchy-Schwarz inequality, the expression on the right-hand side is equal
to 1 iff $h_0(x) = C_1 \psi_{PR}(x)$ for some constant $C_1 > 0$, so that $h(x) = C_1 \psi_{PR}(x) + C_2(\ln x + 1)$ for
some constants $C_1 > 0$ and $C_2$. The set of distributions for which the function $h(x)$ has such a
form generates the domain of LAO in the class $\mathcal{G}$. An example of such an alternative is the density
$g(x,\theta)$ which for small $\theta > 0$ satisfies the formula

$$g(x,\theta) = 1 + \theta\left(\frac{1}{6} + \frac{2}{3}\, x \ln x\right), \quad 0 \le x \le 1. \tag{4.3}$$

This explains why the third alternative leads to asymptotic optimality of the test based on $I_n^{PR}$.
It is in perfect agreement with the findings of the paper [14], where similar problems were solved
for the simple null hypothesis.
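The Cauchy-Schwarz characterization can be illustrated numerically. The sketch below, with hypothetical names and a plain midpoint rule, evaluates the efficiency ratio for an arbitrary function `h`; it should return a value close to 1 for `h` of the form $C_1\psi_{PR}(x) + C_2(\ln x + 1)$ and a smaller value otherwise.

```python
import math


def midpoint(f, n=100_000):
    """Composite midpoint rule on (0, 1); never evaluates f at 0 or 1."""
    step = 1.0 / n
    return step * sum(f((k + 0.5) * step) for k in range(n))


def psi(x):
    """Projection of the kernel of the integral statistic."""
    return 1.0 / 6.0 + (2.0 / 3.0) * x * math.log(x)


def efficiency_integral(h):
    """Ratio (int psi h0)^2 / (int psi^2 * int h0^2), where
    h0(x) = h(x) - (ln x + 1) * int ln(u) h(u) du."""
    c = midpoint(lambda u: math.log(u) * h(u))

    def h0(x):
        return h(x) - (math.log(x) + 1.0) * c

    numerator = midpoint(lambda x: psi(x) * h0(x)) ** 2
    denominator = midpoint(lambda x: psi(x) ** 2) * midpoint(lambda x: h0(x) ** 2)
    return numerator / denominator
```

For instance, `efficiency_integral(lambda x: 2.0 * x - 1.0)` corresponds to the contamination alternative with $r = 2$, for which the closed-form efficiency $27(2r-1)r^2 / (5(r+1)^4)$ equals 0.8.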

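Now let us consider the Kolmogorov type statistic $D_n^{PR}$ with the family of kernels $\Xi_{PR}(X,Y;t)$
and their projections $\xi_{PR}(x;t) = 1\{x < t\}\left(\frac{1}{2} - \frac{x}{t}\right) + xt - \frac{t}{2}$. In this case it is easy to see that the
following asymptotics is true:

$$b_D(\theta; PR) \sim 2\theta \sup_{t \in (0,1)} \left| \int_0^1 \xi_{PR}(x;t)\, h_0(x)\,dx \right|. \tag{4.4}$$

Hence the local efficiency takes the form

$$\mathrm{eff}(D^{PR}) = \lim_{\theta \to 0} \frac{b_D^2(\theta; PR)}{\sup_{t \in (0,1)} \left(4\delta_{PR}^2(t)\right) \cdot 2K(\theta)} = \frac{\sup_{t \in (0,1)} \left(\int_0^1 \xi_{PR}(x;t)\, h_0(x)\,dx\right)^2}{\sup_{t \in (0,1)} \left(\int_0^1 \xi_{PR}^2(x;t)\,dx \cdot \int_0^1 h_0^2(x)\,dx\right)} \le 1.$$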

We can apply once again the Cauchy-Schwarz inequality to the integral in (4.4). It follows
that the sequence of statistics $D_n^{PR}$ is locally asymptotically optimal, and $\mathrm{eff}(D^{PR}) = 1$, iff
$h(x) = C_3\, \xi_{PR}(x; t_0) + C_4(\ln x + 1)$ for $t_0 = \arg\sup_{t \in (0,1)} \delta_{PR}^2(t) = \frac{1+\sqrt{7}}{6}$ and some constants
$C_3 > 0$ and $C_4$. The distributions with such $h(x)$ form the domain of LAO in the class $\mathcal{G}$. The
simplest example is the alternative density $g(x,\theta)$ which for small $\theta > 0$ is given by the formula

$$g(x,\theta) = 1 + \theta\left(1\{x < t_0\}\left(\frac{1}{2} - \frac{x}{t_0}\right) + x t_0 - \frac{t_0}{2}\right), \quad 0 \le x \le 1, \quad \text{where } t_0 = \frac{1+\sqrt{7}}{6}. \tag{4.5}$$

Hence we see that there exist special alternative densities (4.3) and (4.5) of relatively simple
form for which our sequences of statistics are locally asymptotically optimal. This stresses their
merits and potential utility.